perm filename AIH[AM,DBL]1 blob sn#400101 filedate 1978-11-30 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00012 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	AM article for AI Handbook
C00008 00003	.sec(Plausible Reasoning in Mathematics)
C00016 00004	.sec(Discovery in Mathematics)
C00023 00005	.chap(|DESIGN OF THE ↓_AM_↓ PROGRAM|)
C00025 00006	.sec(Representation)
C00029 00007	.sec(Control)
C00043 00008	.chap(RESULTS)
C00050 00009	.sec(AM as a Mathematician)
C00058 00010	.sec(|Limitations of ↓_AM_↓|)
C00064 00011	.sec(Conclusions about ↓_AM_↓)
C00067 00012	REFERENCES MENTIONED BY NUMBER IN THE TEXT
C00084 ENDMK
C⊗;
AM article for AI Handbook

INTRODUCTION

Few doubt the ubiquity of %binductive inference%a, but nowhere is the  use
of inductive  reasoning  so  explicit  as in  the  process  of  scientific
research. The  scientific  method  reads  like  a  recipe  for  induction:
constrain  attention  to  a  manageable  domain,  gather  data,   perceive
regularity in it, formulate hypotheses, conduct experiments to test  them,
and then use their results as the new data with which to repeat this cycle
again.

This suggests that a  good task domain in  which to investigate  inductive
thinking is  Science  itself.  Thus, one  expects  to  find  psychological
studies of scientists %bin vivo%a, and AI programs which carry out  simple
kinds of scientific research.  Attempts at both have been illuminating but
quite rare,  however.   See  [ref. to  Dendral,  Meta-Dendral,  Crysallis,
Mycin, Prospector,...]


We believe that Mathematics is  a good choice of  task domain in which  to
study inductive inference. As Poincare' [13] says,


The genesis of mathematical creation  is a problem which should  intensely
interest the psychologist.  It is  the activity  in which  the human  mind
seems to take least from the outside  world, in which it acts or seems  to
act only of itself  and on itself,  so that in  studying the procedure  of
mathematical thought we may hope to reach what is most essential in  man's
mind.

There are several more concrete reasons supporting this:


1. In doing  math research, one  needn't cope with  the uncertainties  and
fallability of testing equipment; that  is, there are no uncertainties  in
the data  (compared  to, e.g.,  molecular  structure inference  from  mass
spectrograms).

2. Reliance  on  experts'  introspections  is one  of  the  most  powerful
techniques for codifying the judgmental criteria necessary to do effective
work in a field; by limiting the program to areas of mathematics in  which
the programmer  is competent,  he  needn't rely  on external  sources  for
guidance in  formulating such  heuristic rules.   Also, several  excellent
sources are available [13,14,15,21,22,25].

3. The more  formal a  science is,  the easier it  is to  automate. For  a
machine to carry out research  in psychology would require more  knowledge
about human information processing than  now is known, because  psychology
deals with entities as complex  as you and I.  Also, in a formal  science,
the %blanguages%* to communicate information can be simple even though the
%bmessages%* themselves be sophisticated.

4.   Since  mathematics  can  deal  with  any  conceivable  constructs,  a
researcher there is not limited  to explaining observed data.  Related  to
this is the freedom  to investigate --  or to give up  on -- whatever  the
researcher wants to. There is no single discovery which is the "goal",  no
given problem to solve, no right or wrong behavior.

5. Unlike  "simpler" fields,  such  as propositional  logic, there  is  an
abundance of heuristic rules available for the picking.



The limitations  of math  as a  domain are  closely intertwined  with  its
advantages.  Having  no  ties  to  real-world data  can  be  viewed  as  a
limitation, as can having no clear  goal. There is always the danger  that
AM will give up on each theory  as soon as the first tough obstacle  crops
up.

We shall now examine Mathematics research as a task, and then describe AM,
a computer program which carries out just this sort of activity.


.sec(Plausible Reasoning in Mathematics)

The inadequacy of formal  deductive methods in  mathematics has long  been
argued   [13,14,15],    often    lamented   in    the    conclusions    to
resolution-theorem-proving articles  [16,17].  An  early use  of  analogic
models in  geometry-theorem  proving  was  quite  successful  [18].  Later
attempts at the introduction of heuristics into resolution theorem-provers
include [16,19,22].   It is  hypothesized that  the limited  successes  of
these schemes was due to the relatively small number of -- and variety  of
-- heuristics  they  possessed,  and  the fact  that  resolution  was  the
dominant driving force in the program.  The reason for the small number of
heuristics is simply the genuine paucity of informal rules of thumb  which
exist for the very  general environment in  which those programs  operate:
domain-independent (asemantic) predicate calculus theorem-proving.

Recently, Lenat has completed a thesis [3] about the mechanization of  the
"other  half"  of  mathematical  activity,  apart  from  proving:  namely,
defining new concepts and noticing plausible conjectures.  While his  "AM"
system had no proof  capabilities, the important point  was how it  viewed
Mathematics. Below is the model of  math research that AM was based  upon.
It has  been pieced  together out  of the  writings of  Poincare',  Polya,
Lakatos, Hadamard, and others:

.begin; spacing 0; preface 1; indent 0,3,0; group; once center; select b; preface 1;
↓_MODEL  OF  MATH  RESEARCH_↓


1. The order in which a math textbook presents a theory is almost the
exact opposite  of the order in which  it was actually discovered and
developed.  In a text, new  definitions are stated with little or  no
motivation, and they turn out to be just the ones needed to state the
next big  theorem, whose proof then magically appears. In contrast, a
mathematician  doing   research  will   examine  some   already-known
concepts, perhaps trying to find some regularity in experimental data
involving them. The patterns he  notices are the conjectures he  must
investigate further, and these relationships directly motivate him to
make new definitions.

.apart;

2.  Each step  the  researcher takes  while developing  a  new theory
involves choosing from  a large set of  "legal" alternatives --  that
is,  searching.   The  key to  keeping this  from  becoming a  blind,
explosive  search  is the  proper  use of  evaluation  criteria. Each
mathematician uses his own  personal heuristics to choose  the "best"
alternative available at each moment.

3.   Non-formal   criteria   (aesthetic  interestingness,   inductive
inference from  empirical evidence,  analogy, and  utility) are  much
more  important   than   formal  deductive   methods  in   developing
mathematically   worthwhile   theories,   and   in  avoiding   barren
diversions.

4. Progress in %bany%*  field of mathematics demands much  non-formal
heuristic  expertise  in  %bmany%*  different  "nearby"  mathematical
fields.  So a  broad, universal  core of  knowledge must  be mastered
before any single theory can meaningfully be developed.

5. It is sufficient (and  pragmatically necessary) to have and  use a
large  set  of  informal  heuristic  rules. These  rules  direct  the
researcher's next activities, depending  on the current situation  he
is  in.   These rules  can  be assumed  to  superimpose ideally:  the
combined  effect of several rules  is just the sum  of the individual
effects.

6. The  necessary  heuristic rules  are  virtually  the same  in  all
branches of mathematics,  and at all levels of  sophistication.  Each
specialized  field will  have some of  its own  heuristics; those are
normally much more powerful than the general-purpose heuristics.

7. For true understanding, the researcher should  grasp
(have access
to,  relate to,  store,  be able  to  manipulate, be  able to  answer
questions about)    each  concept  in  several  ways:  declaratively,
abstractly, operationally,  knowing  when it  is relevant,  and as  a
bunch of examples.

8. Common metaphysical  assumptions about nature and  science: Nature
is   fair,  uniform,  and   regular.     Coincidences  have  meaning.
Statistical considerations  are valid  when looking  at  mathematical
data.   Simplicity and  symmetry and  synergy are  the rule,  not the
exception.

.end; skip 2;

Let us try to motivate some of these assumptions, by elaborating
on how (in our view) mathematical discoveries are made.




.sec(Discovery in Mathematics);

Before  discussing  how  to  %bsynthesize%*  a  new  mathematical  theory,
consider briefly  how to  %banalyze%* one,  how to  construct a  plausible
chain of reasoning which stretches from a given discovery all the way back
to well-known concepts.

.subsec(Analysis of a Discovery);

One can rationalize a  given discovery by  working backwards, by  reducing
the creative  act to  simpler  and simpler  creative acts.   For  example,
consider the concept  of prime numbers.   How might one  be led to  define
such a notion,  if he'd never  heard of it  before?  Notice the  following
plausible strategy:

.ONCE spacing 0; INDENT 9,9,9 SELECT b;

"If f is a function which transforms elements of A into elements of B, and
B is ordered, then consider just those members of A which are  transformed
into ↓_extremal_↓ elements of B.  This set is an interesting subset of  A.
Name it and study it."

When f(x) means  "divisors of x",  and the ordering  is "by length",  this
heuristic says to consider  those numbers which have  a minimal number  of
factors -- that  is, the primes.   So this rule  actually %breduces%*  our
task from "proposing the concept of prime numbers" to two more  elementary
problems: "discovering ordering-by-length" and "inventing divisors-of".

But suppose we know this general rule: %b"If f is an interesting function,
consider its inverse."%* It reduces the task of discovering divisors-of to
the simpler  task of  discovering multiplication.   Eventually, this  task
reduces to  the  discovery  of  very  basic  notions,  like  substitution,
set-union, and equality.   To explain  how a given  researcher might  have
made a given  discovery, such  an analysis  must be  continued until  that
inductive task is  reduced to "discovering"  notions which the  researcher
already knew, which were his conceptual primitives.

.subsec(Syntheses of Discoveries)


Suppose  a  large  collection  of  these  heuristic  strategies  has  been
assembled (e.g., by analyzing a  great many discoveries, and writing  down
new heuristic  rules  whenever  necessary).   Instead  of  using  them  to
%bexplain%* how a given idea might have evolved, one can imagine  starting
from  a  basic  core  of   knowledge  and  "running"  the  heuristics   to
%bgenerate%* new  concepts.  We're  talking  about reversing  the  process
described in the last section: not how to %bexplain%* discoveries, but how
to %bmake%* them.

Notice that this forward  search is much  "bushier", much more  explosive,
than was the backwards analysis previously described.  So it's much harder
to actually make a  discovery than to rationalize  -- by hindsight --  how
one might  have  made  it.   We all  have  noticed  this  phenomenon,  the
"Why-didn't-I-think-of-that-sooner!!"  feeling.

The forward search is too explosive; we may hypothesize that the scientist
employs some kind of informal rules of thumb to constrain it. That is,  he
doesn't really follow  rules like  %b"Look at  the inverse  of each  known
function f"%a,  because that  would take  up too  much time.  Rather,  his
heuristic  rules   might  be   more   naturally  stated   as   productions
(condition/action rules)  like  this:  %b"If  f is  1-1  and  Range(f)  <<
Domain(f), Then look at f-inverse.%a" Henceforth, ↓_"%bheuristic rule%a"_↓
will mean a conditional rule of  thumb. In any particular situation,  some
subset of these rules will "trigger", and will suggest a manageable  space
of plausible  activities to  perform.  After  exploring that  space for  a
while, the situation  will have changed,  and the cycle  will begin  anew.
This double  layer of  heuristic  search shouldn't  bother anyone;  it  is
analogous to performing double induction instead of standard  mathematical
induction.

.chap(|DESIGN OF THE ↓_AM_↓ PROGRAM|);

Such syntheses are  precisely what  AM -- and  a scientist  -- does.   The
program consists of  a large  corpus of  primitive mathematical  concepts,
each  with  a  few  associated  heuristics.   Each  such  heuristic  is  a
situation/action  rule  which  functions   as  a  local  "plausible   move
generator".  Some suggest tasks for the system to carry out, some  suggest
ways of satisfying a given task, etc.  AM's activities all serve to expand
AM itself, to  enlarge upon a  given body of  mathematical knowledge.   AM
uses its heuristics  as judgmental  criteria to guide  development in  the
most promising direction.

.sec(Representation);

Each concept  is  represented  as  a frame-like  data  structure  with  25
different facets  or  slots.  The  types  of facets  include:  %bEXAMPLES,
DEFINITIONS, GENERALIZATIONS, DOMAIN/RANGE, ANALOGIES,  INTERESTINGNESS,%*
and many others.  Modular representation of concepts provides a convenient
scheme for organizing the heuristics; for example, the following  strategy
fits into the %bEXAMPLES%* facet of the %bPREDICATE%* concept:

.once indent 5,5,5; preface 1; spacing 0;

%b"If, empirically, 10 times  as many elements FAIL  some predicate P,  as
SATISFY it, then some ↓_generalization_↓ (weakened version) of P might  be
more interesting than P"%a.

AM considers this  suggestion after  trying to  fill in  examples of  each
predicate.   In   fact,   after   AM  attempts   to   find   examples   of
%bSET-EQUALITY%a, so  few are  found that  AM decides  to generalize  that
predicate.  The result is the creation  of several new predicates, one  of
which happens  to mean  "Has-the-same-length-as"  -- i.e.,  a  rudimentary
precursor to natural numbers.

Below is a typical concept, PRIMES, in  a state long after AM defined  and
explored it.

.begin nofill indent 0 preface 0

.group;
NAME: Prime Numbers 
DEFINITIONS: 
		ORIGIN: Number-of-divisors-of(x) = 2 
		PREDICATE-CALCULUS: Prime(x) ≡ (∀z)(z|x α→  z=1 α⊗ z=x) 
		ITERATIVE: (for x>1): For i from 2 to Sqrt(x), ¬(i|x) 

EXAMPLES: 2, 3, 5, 7, 11, 13, 17 
		BOUNDARY: 2, 3 
		BOUNDARY-FAILURES: 0, 1 
		FAILURES: 12 

GENERALIZATIONS: Nos., Nos. with an even no. of divisors, Nos. with a prime no. of divisors 

SPECIALIZATIONS: Odd Primes, Prime Pairs, Prime Uniquely-addables 

CONJECS: Unique factorization, Goldbach's conjecture, Extremes of Number-of-divisors-of 

INTU'S: %bA metaphor to the effect that Primes are the building blocks of all numbers%* 

ANALOGIES: 
		Maximally-divisible numbers are converse extremes of Number-of-divisors-of 
		Factor a non-simple group into simple groups 

INTEREST: Conjectures tying Primes to TIMES, to Divisors-of, to closely related operations 

WORTH: 800 

.apart; end; skip 2;

"Creating a  new  concept" is  a well-defined  activity: it  involves
setting  up a new  data structure like the one above,  and filling in
entries for some  of its facets  or slots.   Filling in a  particular
facet  of a  particular concept  is also  quite well-defined,  and is
accomplished  by executing a collection  of relevant heuristic rules.

.sec(Control);


AM is initially given a collection of  115 core concepts, with only a  few
facets filled in for each.  Its sole  activity is to choose some facet  of
some concept, and fill in that particular slot.  In so doing, new  notions
will often emerge.  Uninteresting  ones are forgotten, mildly  interesting
ones are kept as parts of one  facet of one concept, and very  interesting
ones are granted full concept-module status. Each of these new modules has
dozens of blank slots, hence the  space of possible actions (blank  facets
to fill in) grows rapidly.  The  same heuristics are used both to  suggest
new directions for investigation, and  to limit attention: both to  sprout
and to prune.



The fundamental kind of  task which AM performs,  its basic action, is  to
fill in a given  facet of a  given concept. To decide  which such task  to
work on  next, AM  maintains an  %bagenda%* of  tasks, a  global  job-list
ordered by priority.  A typical task is %b"Fill-in examples of  Primes"%*.
The agenda  may  contain  hundreds  of  entries  such  as  this  one.   AM
repeatedly selects the top task from the agenda and tries to carry it out.
This is the whole control structure! Of course, we must still explain  how
AM creates plausible  new tasks  to place on  the agenda,  how AM  decides
which task will be the best one to execute next, and how it carries out  a
task.

If  the  task  is  %b"Fill  in  new  Algorithms  for  Set-union"%*,   then
%bsatisfying%* it would  mean actually synthesizing  some new  procedures,
some new  LISP code  capable of  forming the  union of  any two  sets.   A
heuristic rule is %brelevant%* to a task iff executing that rule brings AM
closer to satisfying that task.  Relevance is determined a priori by where
the rule  is stored.  A rule  tacked onto  the Domain/range  facet of  the
Compose concept  would  be presumed  relevant  to the  task  %b"Check  the
Domain/range of Insert-o-Delete"%a.

Once a task  is chosen from  the agenda, AM  gathers some heuristic  rules
which might be relevant to satisfying  that task.  They are executed,  and
then AM picks  a new  task.  While  a rule  is executing,  three kinds  of
actions or effects can occur:


(i) Facets of some  concepts can get filled  in (e.g., examples of  primes
may actually be found and tacked onto the "Examples" facet of the "Primes"
concept).  A typical heuristic rule which might have this effect is:


%bTo fill in examples of X, where X is a kind of Y (for some more  general
concept Y),

Check the examples of Y; some of them may be examples of X as well.%a



For the task of  filling in examples  of Primes, this  rule would have  AM
notice that Primes is a  kind of Number, and  therefore look over all  the
known examples of  Number. Some  of those would  be primes,  and would  be
transferred to the Examples facet of Primes.

.skip 3

(ii) New concepts  may be  created (e.g.,  the concept  "primes which  are
uniquely representable as the sum of  two other primes" may be somehow  be
deemed worth studying).  A  typical heuristic rule  which might result  in
this new concept is:

%bIf some (but not most)  examples of X are also  examples of Y (for  some
concept Y),

Create a new concept  defined as the intersection  of those 2 concepts  (X
and Y).%*


Suppose AM has already isolated the concept of being representable as  the
sum of  two  primes  in only  one  way  (AM actually  calls  such  numbers
"Uniquely-prime-addable numbers").  When AM  notices that some primes  are
in this set, the above  rule will create a  brand new concept, defined  as
the set of numbers which are both prime and uniquely prime addable.

.skip 3

(iii) New tasks may be added to the agenda (e.g., the current activity may
suggest that the  following task  is worth  considering:  "Generalize  the
concept of prime  numbers").  A  typical heuristic rule  which might  have
this effect is:

%bIf very few examples of X are found,

Then add the following task to the agenda:  "Generalize the concept X".%*


Of course, AM contains a precise meaning for the phrase "very few".   When
AM looks for primes among examples  of already-known kinds of numbers,  it
will find dozens of non-examples for every example of a prime it uncovers.
"Very few"  is  thus naturally  implemented  as a  statistical  confidence
level.

.skip 3


The concept  of an  agenda is  certainly not  new:  schedulers  have  been
around for a long time.  But  one important feature of AM's agenda  scheme
%bis%* a new  idea:  attaching --  and using --  a list of  quasi-symbolic
reasons to each task which explain why the task is worth considering,  why
it's plausible.   %bIt is  the responsibility  of the  heuristic rules  to
include  reasons  for  any  tasks  they  propose.%a  For  example,   let's
reconsider the heuristic rule mentioned  in (iii) above.  It really  looks
more like the following:


%bIf very few examples of X are found,

Then add the following task to the agenda: "Generalize the concept X", for
the following reason:  "X's are  quite rare; a  slightly less  restrictive
concept might be more interesting".%a



If the same  task is  proposed by  several rules,  then several  different
reasons for it  may be present.   In addition, one  ephemeral reason  also
exists: "Focus of attention". Any tasks which are similar to the one  last
executed get "Focus of  attention" as a bonus  reason.  AM uses all  these
reasons, e.g.   to  decide how  to  rank the  tasks  on the  agenda.   The
"intelligence" AM exhibits is not so  much "what it does", but rather  the
%border%* in which  it arranges  its agenda.  For  example, alternating  a
randomly-chosen task and  the "best" task  (the one AM  chose to do)  only
slows the  system down  by a  factor of  2, yet  it totally  destroys  its
credibility as a rational researcher (as judged by the human user of  AM).
This is one conclusion of an experiment performed upon AM.

AM uses the list of reasons in another way: Once a task has been selected,
the quality of the reasons is used  to decide how much time and space  the
task will be permitted to  absorb, before AM quits and  moves on to a  new
task.


Executing a  task is  achieved by  locating relevant  rules of  thumb  and
evaluating them. The  location is  performed efficiently  because all  the
concepts are related by  generalization/specialization links, and  because
the following key heritability property holds:  Any method for filling  in
facet f  of concept  C  will also  work  for filling  in  facet f  of  any
specialization  of  C.   Thus   when  the  task   "Fill  in  examples   of
%bSET-EQUALITY%a"   is   chosen,   AM   asks   each   generalization    of
%bSET-EQUALITY%a for help. Thus  it asks for ways  to fill in examples  of
any Predicate, any Activity, any Concept, and finally for ways to fill  in
examples of  Anything.  One  such  heuristic rule  known to  the  Activity
concept says: "Actually execute the activity on some random members of its
domain." Thus to fill in examples of %bSET-EQUALITY%a, its domain facet is
inspected, and AM sees it takes a  pair of objects as its arguments.  Then
AM accesses the Examples facet of the concept %bOBJECT%a, where it finds a
large list of objects. Obeying the  heuristic rule, AM repeatedly picks  a
pair of them  at random,  and sees  if they  satisfy %bSET-EQUALITY%a  (by
actually running  the LISP  function  stored in  the Algorithms  facet  of
%bSET-EQUALITY%a).  While  this  will  typically  return  False,  it  will
occasionally locate -- by random chance -- a pair of equal sets.

Other heuristics, tacked onto  other generalizations of  %bSET-EQUALITY%a,
provide additional methods  for executing  the task "Fill  in examples  of
%bSET-EQUALITY%a.  A heuristic stored on the concept %bANY-CONCEPT%a  says
to symbolically instantiate the  definition. A bag  of tricks is  provided
for this  purpose,  one  of  which ("instantiate  the  base  step  of  the
recursion")  works  nicely  on  the  recursive  definition  provided   for
%bSET-EQUALITY%a.  Notice that, as one might expect, the more general  the
concept is, the weaker (more time-consuming) its heuristics are.  For this
reason,  AM  consults  each  concept's   rules  in  order  of   increasing
generalization.

.chap(RESULTS);


.sec(An Excerpt)


The purpose of  this example  is to  convey a  bit of  AM's flavor.  After
reading through it, the  reader should be convinced  that AM is %bnot%*  a
theorem-prover, nor is it %brandomly%* manipulating entries in a knowledge
base, nor  is  it  %bexhaustively%*  manipulating  or  searching.   AM  is
carefully growing a network  of data structures representing  mathematical
concepts, by repeatedly using heuristics both (a) for guidance in choosing
a task to work on next, and  (b) to provide methods to satisfy the  chosen
task.


The following  points are  important but  can't be  conveyed by  any  lone
example:


(i) Although AM  appears to  have reasonable  natural language  abilities,
very little effort was expended in this area.

(ii) It is important to ask how general the program is:  is the  knowledge
base "just  right"  (i.e.,  finely  tuned to  elicit  this  one  chain  of
behaviors)?  The answer is "%bNo%*":  The whole point of this project  was
to show that  a relatively  small set of  general heuristics  can guide  a
nontrivial discovery process.  Each activity,  each task, was proposed  by
some heuristic rule (like  "look for extreme cases  of X") which was  used
time and time again,  in many situations.  It  was not considered fair  to
insert heuristic guidance which could only "guide" in a single  situation.
For example,  the same  heuristics which  lead AM  to decomposing  numbers
(using TIMES-inverse) and thereby  discovering unique factorization,  also
leads to decomposing numbers  (using ADD-inverse) and thereby  discovering
Goldbach's conjecture.





Now here is the excerpt itself:


** ↓_TASK 65:_↓ **   Fill in Examples of the concept "Divisors-of".

\3 Reasons:\(1) No known examples of Divisors-of so far.
\\(2) TIMES, which is related to Divisors-of, is now very interesting.
\\(3) Focus of attention: AM recently defined Divisors-of.

26 examples found, in 9.2 seconds. e.g., Divisors-of(6)=α{1 2 3 6α}.


** ↓_TASK 66:_↓ **  Consider numbers having small sets of Divisors-of.

\2 Reasons:\(1) Worthwhile to look for extreme cases.
\\(2) Focus of attention: AM recently worked on Divisors-of.

Filling in examples of numbers with 0 divisors.
   0 examples found, in 4.0 seconds.
   Conjecture: no numbers have precisely 0 divisors.

Filling in examples of numbers with 1 divisors.
   1 examples found, in 4.0 seconds. e.g., Divisors-of(1) = α{1α}.
   Conjecture: 1 is the only number with precisely 1 divisor.

Filling in examples of numbers with 2 divisors.
   24 examples found, in 4.0 seconds. e.g., Divisors-of(13) = α{1 13α}.
   No obvious conjecture. May merit more study.
   Creating a new concept: "Numbers-with-2-divisors".

Filling in examples of numbers with 3 divisors.
   11 examples found, in 4.0 seconds. e.g., Divisors-of(49) = α{1 7 49α}.
   All numbers with 3 divisors are also Squares. Definitely merits more study.
   Creating a new concept: "Numbers-with-3-divisors".


** ↓_TASK 67:_↓ **  Consider the square-roots of Numbers-with-3-divisors.

\2 Reasons:\(1) Numbers-with-3-divisors are unexpectedly also Perfect Squares.
\\(2) Focus of attention: AM recently worked on Nos-with-3-divisors.

   All square-roots of Numbers-with-3-divisors seem to be Numbers-with-2-divisors.
        e.g.,  Divisors-of(Square-root(169)) = Divisors-of(13) = α{1 13α}.
   Even the converse of this seems empirically to be true.
        i.e.,  the square of each No-with-2-divisors seems to be a No-with-3-divisors.
        The chance of coincidence is below acceptable limits.
   Boosting the interestingness rating of each of the concepts involved.


** ↓_TASK 68:_↓ **  Consider the squares of Numbers-with-3-divisors.

\3 Reasons:\(1) Squares of Numbers-with-2-divisors were interesting.
\\(2) Square-roots of Numbers-with-3-divisors were interesting.
\\(3) Focus of attention: AM recently worked on Nos-with-3-divisors.


.sec(AM as a Mathematician)

Now that we've  seen how  AM works,  and we've been  exposed to  a bit  of
"local" results,  let's take  a  moment to  discuss  the totality  of  the
mathematics  which  AM  carried   out.   Like  a  contemporary   historian
summarizing the work of the Babylonian mathematicians, we shan't  hesitate
to use current terms and criticize by current standards.

AM began its investigations with  scanty knowledge of a few  set-theoretic
concepts.  Most of  the obvious  set-theory relations  (e.g., de  Morgan's
laws) were eventually uncovered; since AM never fully understood  abstract
algebra, the  statement  and  verification  of each  of  these  was  quite
obscure.  AM never  derived a formal  notion of infinity,  but it  naively
established conjectures like "a set can never be a member of itself",  and
procedures for making chains of new sets ("insert a set into itself").  No
sophisticated set theory (e.g., diagonalization) was ever done.

After this initial period of  exploration, AM decided that "equality"  was
worth generalizing, and  thereby discovered  the relation  "same-size-as".
"Natural numbers"  were based  on this,  and soon  most simple  arithmetic
operations were defined.

Since addition  arose as  an  analog to  union,  and multiplication  as  a
repeated substitution, it came  as quite a surprise  when AM noticed  that
they  were  related   (namely,  N+N  =   2xN).   AM  later   re-discovered
multiplication in three other ways:  as repeated addition, as the  numeric
analog of  the Cartesian  product of  seion of  square-root.  AM  remained
content  to  play  around  with  the  concept  of  %binteger]-square-root.
Although it isolated the set of numbers  which had no square root, AM  was
never close to discovering rationals, let alone irrationals. No notion  of
"closure" was provided to -- or discovered by -- AM.

Raising to  fourth-powers, and  fourth-rooting,  were discovered  at  this
time.  Perfect  squares and  perfect  fourth-powers were  isolated.   Many
other numeric  operations  and  kinds  of numbers  were  found  to  be  of
interest:   Odds,  Evens,  Doubling,  Halving,  Integer-square-root,  etc.
Although it isolated the set of numbers  which had no square root, AM  was
never close to discovering rationals, let alone irrationals. No notion  of
"closure" was provided to -- or discovered by -- AM.


The associativity and  commutativity of multiplication  indicated that  it
could accept  a BAG  of numbers  as  its argument.   When AM  defined  the
inverse operation  corresponding  to  Times,  this  property  allowed  the
definition to be: "any bag of numbers (>1) whose product is x".  This  was
just the notion  of factoring  a number  x.  Minimally-factorable  numbers
turned out to be what  we call primes.  Maximally-factorable numbers  were
also thought to be interesting.

Prime pairs were discovered  in a bizarre way:  by restricting the  domain
and range of addition to primes (i.e., solutions of p+q=r in primes).


AM conjectured the fundamental theorem of arithmetic (unique factorization
into primes) and Goldbach's conjecture (every even number >2 is the sum of
two primes) in a surprisingly symmetric way.  The unary representation  of
numbers gave way to a representation as  a bag of primes (based on  unique
factorization), but AM never thought  of exponential notation.  Since  the
key concepts  of remainder,  greater-than,  gcd, and  exponentiation  were
never mastered, progress in number theory was arrested.

When a new base of %bgeometric%a concepts was added, AM began finding some
more general associations.   In place  of the strict  definitions for  the
equality of lines, angles, and triangles, came new definitions of concepts
we refer to as  Parallel, Equal-measure, Similar, Congruent,  Translation,
Rotation, plus many which  have no common name  (e.g. the relationship  of
two triangles sharing a common angle).  A cute geometric interpretation of
Goldbach's conjecture was  found [Given all  angles of a  prime number  of
degrees, (0,1,2,3,5,7,11,...,179 degrees),  then any angle  between 0  and
180 degrees can be approximated (to within 1 degree) as the sum of two  of
those angles.]   Lacking a  geometry "model"  (an analogic  representation
like the  one Gelernter  employed [18]),  AM was  doomed to  propose  many
implausible geometric conjectures.



.sec(|Limitations of ↓_AM_↓|);

Although  AM  fared  well  according  to  several  different  measures  of
performance, of  greatest significance  from  the point  of view  of  this
handbook are its %blimitations%a.


As AM ran  longer and  longer, the concepts  it defined  were further  and
further from the primitives it began with. Thus "prime-pairs" were defined
using "primes"  and  "addition", the  former  of which  was  defined  from
"divisors-of", which in turn came from "multiplication", which arose  from
"addition",  which  was  defined  as  a  restriction  of  "union",   which
(finally!) was a primitive supplied to AM initially. When AM  subsequently
needed help with  prime pairs, it  was forced  to rely on  rules of  thumb
supplied initially about %bunion%aing. Although the heritability  property
of heuristics did ensure  that those rules were  still valid, the  trouble
was that they  were too  general, too weak  to deal  effectively with  the
specialized notions of primes and arithmetic.

As the derived  concepts moved further  away from finite  set theory,  the
efficacy of  the  initial heuristics  decreased.   AM began  to  "thrash",
appearing to lose most  of its heuristic guidance.  It worked on  concepts
like "prime triples", which may be reasonable to investigate if one  knows
nothing about numbers, but  which is obcene to  one who does.  As  another
example, AM didn't realize that  the unique factorization of numbers  into
primes (UFT)  was  a  much more  significant  conjecture  than  Goldbach's
anomaly.

The key deficiency was that of adequate %bmeta%a-rules[2,3,4]:  heuristics
which cause the creation and modification  of new heuristics. Here is  one
such rule, which would have taken care of the "Goldbach vs. UFT" problem:

%bAfter applying the %a"look at  the inverse of extrema"%b heuristic,  and
thereby defining  a  new concept  C  (as  f-inverse of  b),  Synthesize  a
heuristic  which  indicates  that  conjectures  involving  C  and  f   (or
f-inverse) are very significant and natural, whereas those involving C and
unrelated operations are probably anomalies.%a


How would this meta-rule be used?  When primes are defined as the  inverse
image of  doubletons, under  the  operation "divisors-of",  the  meta-rule
would trigger, and a  new rule would be  synthesized.  That new  heuristic
would say that  conjectures about  primes were natural  iff they  involved
multiplication or division.  Thus the UFT would be rated as important, and
Goldbach's conjecture as cute but useless.


Aside from the preceding  major limitation, most  of the other  complaints
are quite superficial.  Many concepts one  might consider "primitive"  are
missing from AM: proof, tricks for finding counterexamples, numbers,  etc.
Very few of  them are  ever discovered  by AM,  and even  those which  are
discovered will not have any powerful heuristics filled in for them.


Analogies in general were under-utilized. Specifically, analogies  between
heuristics were never even considered.  If one characterizes an analogy as
a  (partial)  correspondence  between  two  collections  of  objects   and
operators, then it is a small  conceptual step to imagine heuristic  rules
which look for and  develop such mappings: the  image of partial  matching
comes immediately  to mind.  AM possessed  a few  domain-independent  such
rules,  and  they  managed  to  produce  some  analogies  (e.g.,   between
multiplication   and    addition;   between    sets/union/same-size    and
numbers/add/equality), but the overall results  were quite meager in  this
area.


.sec(Conclusions about ↓_AM_↓);


The AM project stands as a living demonstration that a few hundred general
heuristic  rules  suffice  to  guide  a  searcher  --  an  automated  math
researcher -- as it  explores and expands a  large but incomplete base  of
math concepts.   AM  shows  that  creative  theoretical  research  can  be
effectively  modelled  as  heuristic  search,  just  as  Meta-Dendral  [1]
establishes a  similar hypothesis  for  the more  constrained,  real-world
field of mass spectroscopy.

AM introduced (1975)  a control structure  based upon an  agenda of  small
plausible research tasks, each with a list of supporting reasons attached.

The main successes were the few novel  ideas it came up with [including  a
new result in  number theory,  dealing with numbers  having very  ↓_many_↓
divisors], the ease with which  new domains were discovered [e.g.,  number
theory] or  introduced  by  hand  [plane  geometry],  and  the  apparently
rational sequences of behavior AM exhibited.

The continuation of this line of research by Lenat is the Eurisko program.
The hypothesis  that  has been  added  is that  the  meta-level  knowledge
required to synthesize and reason about heuristics is a %bsubset%a of  the
knowledge already in AM about  synthesizing and reasoning about  concepts.
That is, Eurisko's meta-rules  are merely some of  the very general  rules
that AM already had.  The only real change, then, from AM to Eurisko is to
re-represent (recode) each  heuristic from Lisp  code into a  full-fledged
concept with facets.  The heuristics, which deal with facets of  concepts,
will then be capable of dealing  with each other.  This work is  currently
in progress at Stanford University.


REFERENCES MENTIONED BY NUMBER IN THE TEXT

.begin fill; indent 0,4,0; tabs 5; turn on "\"; spacing 0; preface 1;

[1]\Bruce  G.  Buchanan,  %bApplications  of  Artificial  Intelligence  to
Scientific Reasoning%a, Second USA-Japan Computer Conference, published by
AFIPS and IPSJ, Tokyo, 1975, pp. 189-194.

[2]\Randall  Davis,  %bApplications  of   Meta  Level  Knowledge  to   the
Construction, Maintenance,  and  Use  of  Large  Knowledge  Bases%a,  SAIL
AIM-271, Artificial  Intelligence Laboratory,  Stanford University,  July,
1976.


[3]\Douglas  B.  Lenat,  %bAM:  An  Artificial  Intelligence  Approach  to
Discovery in Mathematics as  Heuristic Search%a, SAIL AIM-286,  Artificial
Intelligence Laboratory, Stanford University, July, 1976.  Jointly  issued
as Computer Science Dept. Report No. STAN-CS-76-570.

[4]\R. D. Laing, "%bRules  and Metarules%a", in  uu-%bThe Politics of  the
Family and Other Essays%a], Vintage Books, N.Y., 1971, pp. 103-116.

[5]\Herbert Simon and  Kenneth Kotovsky, %bHuman  Acquisition of  Concepts
for Sequential Patterns%a,  Psychology Review 70,  No. 6, November,  1963,
pp. 534-546.


[6]\Malcolm Pivar  and  Mark  Finkelstein, %bAutomation,  using  LISP,  of
Inductive Inference on Sequences%a, in (Berkeley and Bobrow, eds.)  uu-The
Programming Language LISP:  Its Operation  and Applications],  Information
International, Cambridge, Mass., March, 1964.


[7]\C.  G.  Hempel,  %bFundamentals  of  Concept  Formation  in  Empirical
Science%a, University of Chicago Press, Chicago, 1952.

[8]\Patrick  H.   Winston,   %bLearning   Structural   Descriptions   from
Examples%a, EE TR-76, Project MAC TR-231, MIT AI Lab, September, 1970.

[9]\Herbert Simon, %bDoes Scientific Discovery Have a Logic?%a, Philosophy
of Science, 40, No. 4, December, 1973, pp. 471-480.

[10]\Marvin Minsky, %bFrames%a, in (P. Winston, ed.)  uu-The Psychology of
Computer Vision], McGraw Hill, N.Y. 1975.



[11]\Edward  A.  Feigenbaum,  Bruce   Buchanan,  Joshua  Lederberg,   %bOn
Generality and Problem Solving: A Case Study Using the DENDRAL  Program%a,
in (Meltzer and Michie, eds.) Machine Intelligence 6, 1971, pp. 165-190.

[12]\Inductive determination  of the  structure of  proteins by  automatic
processing of  electron  density X-ray  crystollographic  data.  (informal
communications with Dr.  Robert Engelmore and  Prof. Edward Feigenbaum  of
Stanford University.)


[13]\Henri  Poincare',   %bThe  Foundations   of  Science:   Science   and
Hypothesis, The Value of Science, Science and Method%a, The Science Press,
N.Y. 1929.

[14]\G. Polya,  %bMathematics and  Plausible Reasoning%a,  John Wiley  and
Sons, N.Y., (2 vols.) 1954.

[15]\Imre Lakatos,  %bProofs and  Refutations: The  Logic of  Mathematical
Discovery%a, Cambridge U. Press, London, 1976.

[16]\Woody W. Bledsoe, %bSplitting  and Reduction Heuristics in  Automatic
Theorem Proving%a, Artificial Intelligence 2, 1971, p. 73.

[17]\H. Wang,  %bToward Mechanical  Mathematics%a,  IBM J.   Research  and
Development 4, No. 1, January, 1960, pp. 2-22.

[18]\H. Gelernter, %bRealization of a Geometry-Theorem Proving  Machine%a,
in (Feigenbaum and Feldman, eds.) uu-Computers and Thought], McGraw  Hill,
N.Y., 1963, pp. 134-152.

[19]\Robert Kling, %bReasoning by  Analogy with Applications to  Heuristic
Probllem Solving: A Case  Study%a, Memo AIM-147,  CS Dept. Report  CS-216,
Stanford U., August, 1971.

[20]\T.  Evans,  %bA  Program   for  the  Solution  of   Geometric-Analogy
Intelligence Test Questions%a,  in (Minsky,  ed.) uu-Semantic  Information
Processing]. MIT Press, Cambridge, Mass., 1968, pp. 271-353.

[21]\Donald Knuth,  %bSurreal  Numbers%a,  Addison-Wesley,  Reading,Mass.,
1974.


[22]\D.  Brotz,  %bEmbedding  Heuristic  Problem  Solving  Methods  in   a
Mechanical Theorem Prover%a,  Stanford University  Computer Science  Dept.
Report No. STAN-CS-74-443, August, 1974.

[23]\Arthur Koestler, %bThe Act of Creation%a, Dell Pub., N.Y., 1967.

[24]\Seymour Papert,  %bTeaching  Children  to  be  Mathematicians  Versus
Teaching About Mathematics%a, Intl. J. Math Ed. Sci. Technology 3, No.  3,
July-Sept., 1972, pp. 249-262.

[25]\Jaques Hadamard, %bThe  Psychology of Invention  in the  Mathematical
Field%a, Dover Pub., N.Y., 1945.

.end